Lecture 3 Introduction: Tackling Non-Linear Classification

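We now move beyond the limits of linear models, which struggle to classify data that cannot be separated by a straight line. Today's goal is to use the PyTorch workflow to build a deep neural network (DNN) that can learn complex non-linear decision boundaries, which are essential for real-world classification tasks.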

1. Visualizing the Need for Non-Linearity

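As a first step, we create a challenging synthetic dataset, such as the two-moons distribution, and show visually why a simple linear model fails on it. This setup motivates using a deep architecture to approximate the curved boundary that separates the classes.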
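A minimal sketch of this step, assuming PyTorch is installed. The helper `make_two_moons`, the noise level, and the training settings are illustrative choices, not taken from the lecture. It builds two interleaving half-circles and fits a purely linear classifier as a baseline.

```python
import math

import torch
from torch import nn


def make_two_moons(n_per_class=200, noise=0.1, seed=0):
    """Hypothetical helper: two interleaving half-circles, one per class."""
    gen = torch.Generator().manual_seed(seed)
    t = torch.linspace(0.0, math.pi, n_per_class)
    upper = torch.stack([torch.cos(t), torch.sin(t)], dim=1)              # class 0
    lower = torch.stack([1.0 - torch.cos(t), 0.5 - torch.sin(t)], dim=1)  # class 1
    X = torch.cat([upper, lower], dim=0)
    X = X + noise * torch.randn(X.shape, generator=gen)  # Gaussian jitter
    y = torch.cat([torch.zeros(n_per_class), torch.ones(n_per_class)]).unsqueeze(1)
    return X, y  # shapes (2 * n_per_class, 2) and (2 * n_per_class, 1)


X, y = make_two_moons()

# Linear baseline: one Linear layer followed by a sigmoid (logistic regression).
torch.manual_seed(0)
linear_model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
optimizer = torch.optim.SGD(linear_model.parameters(), lr=0.5)

for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(linear_model(X), y)
    loss.backward()
    optimizer.step()

# Threshold the predicted probabilities at 0.5 and compare with the labels.
with torch.no_grad():
    predictions = (linear_model(X) > 0.5).float()
    linear_acc = (predictions == y).float().mean().item()

print(f"linear baseline accuracy: {linear_acc:.3f}")
```

Because the two half-circles interleave, no single straight line separates them, so the baseline's accuracy should stay below 100% however long it trains. Plotting `X` colored by `y` (for example with matplotlib) makes the curved gap between the classes visible.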

The Power of Non-Linear Activation Functions

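The core principle of a DNN is to introduce non-linearity into the hidden layers through functions such as ReLU. Without it, stacked layers collapse into a single linear model, no matter how deep the network is.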
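To see why, consider two stacked linear layers with no activation between them. The symbols $W_1, b_1, W_2, b_2$ are illustrative notation, not taken from the lecture:

$$
W_2 (W_1 x + b_1) + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2) = W' x + b'
$$

The composition is again a single linear (affine) layer with $W' = W_2 W_1$ and $b' = W_2 b_1 + b_2$. Inserting ReLU between the layers, as in $W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2$ with $\mathrm{ReLU}(z) = \max(0, z)$, prevents this collapse because ReLU is not a linear function.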

Question 1

What is the primary purpose of the ReLU activation function in a hidden layer?

Introduce non-linearity so deep architectures can model curves

Speed up matrix multiplication

Ensure the output remains between 0 and 1

Normalize the layer output to a mean of zero

Question 2

Which activation function is used in the output layer to produce a probability for a binary classification task?

Sigmoid

Softmax

ReLU

Question 3

Which loss function corresponds directly to a binary classification problem using a Sigmoid output?

Binary Cross Entropy Loss (BCE)

Mean Squared Error (MSE)

Cross Entropy Loss

Challenge: Designing the Core Architecture

Integrating architectural components for non-linear learning.

Build an nn.Module for the two-moons task. Input features: 2. Output units: 1 (the probability of the positive class).

Step 1

Describe the flow of computation for a single hidden layer in this DNN.

Solution:
Input $\to$ Linear Layer (Weight Matrix) $\to$ ReLU Activation $\to$ Output to Next Layer.
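This flow can be sketched in PyTorch as follows. The hidden width of 16 and the batch size of 8 are assumed values chosen only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# One hidden layer: a linear map (weight matrix plus bias) followed by ReLU.
hidden_layer = nn.Sequential(
    nn.Linear(in_features=2, out_features=16),  # 16 hidden units is an assumption
    nn.ReLU(),
)

x = torch.randn(8, 2)        # a batch of 8 samples with 2 features each
h = hidden_layer(x)          # activations passed on to the next layer

print(h.shape)               # one row per sample, one column per hidden unit
print(bool((h >= 0).all()))  # ReLU leaves no negative activations
```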

Step 2

What must the final layer size be if the input shape is $(N, 2)$ and we use BCE loss?

Solution:
The output layer needs a single unit, so the model's output has shape $(N, 1)$: one probability score per sample, matching the label shape.
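One possible way to put both steps together is sketched below. The class name `MoonsClassifier`, the two hidden layers, and the width of 16 are hypothetical choices, not the lecture's reference solution; the random inputs and labels only stand in for real data so the shapes can be checked.

```python
import torch
from torch import nn


class MoonsClassifier(nn.Module):
    """Hypothetical DNN for the two-moons task: 2 inputs -> 1 probability."""

    def __init__(self, hidden_units: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden_units),
            nn.ReLU(),
            nn.Linear(hidden_units, hidden_units),
            nn.ReLU(),
            nn.Linear(hidden_units, 1),  # single output unit
            nn.Sigmoid(),                # squashes the output into (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


torch.manual_seed(0)
model = MoonsClassifier()

X = torch.randn(32, 2)                    # placeholder inputs, shape (N, 2)
y = torch.randint(0, 2, (32, 1)).float()  # placeholder labels, shape (N, 1)

probs = model(X)                          # shape (N, 1), one probability per sample
loss = nn.BCELoss()(probs, y)             # BCE expects matching shapes

print(probs.shape, loss.item())
```

In practice, many PyTorch models drop the final `nn.Sigmoid()` and use `nn.BCEWithLogitsLoss` instead, which applies the sigmoid inside the loss for better numerical stability. The explicit Sigmoid plus `nn.BCELoss` pairing is kept here to mirror the quiz answers above.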